A rigorous derivation is provided for canonical correlations and partial canonical correlations
for certain Hilbert space indexed stochastic processes. The formulation relies on a key congruence
mapping between the space spanned by a second order, $\mathcal{H}$-valued, process and a particular
Hilbert function space deriving from the process' covariance operator. The main results are obtained
via an application of methodology for constructing orthogonal direct sums from algebraic
direct sums of closed subspaces.

1. Introduction

Canonical correlation analysis (CCA) is one of the principal tools for studying the
relationship between two random vectors in multivariate analysis. There have now been
several attempts to widen the definition of CCA to include vectors of infinite length
and, more generally, stochastic processes (see, e.g., Eubank and Hsing [8] and references
therein). Functional canonical correlation falls into this latter category wherein one
obtains data that represent the sample paths of continuous time processes. In this paper we
provide a framework for canonical correlation and partial canonical correlation analysis
for a class of stochastic processes that includes those arising in functional data.

A somewhat general formulation assumes that we have a probability space $(\Omega, \mathcal{A}, P)$,
a real, separable Hilbert space $\mathcal{H}$, with norm and inner product $\|\cdot\|$ and
$\langle\cdot,\cdot\rangle$, and an $\mathcal{H}$-valued random variable $X$ in the sense of Laha and
Rohatgi [15]; that is, $X: \Omega \to \mathcal{H}$ is a measurable function relative to the Borel
$\sigma$-field generated by the class of all open subsets of $\mathcal{H}$. Our attention will be
restricted to random variables with $E\|X\|^2 < \infty$ with expectation being relative to $P$.
Associated with such a random variable we can define the Hilbert space indexed process

$$Z(f) = \langle X, f\rangle \tag{1.1}$$

for $f \in \mathcal{H}$. Then, from Vakhania et al. [20] there exists a mean element $h \in \mathcal{H}$ and a
covariance operator $S$ such that $E[\langle X, f\rangle] = \langle h, f\rangle$ and
$E[\langle X - h, f\rangle\langle X - h, f'\rangle] = \langle f, Sf'\rangle$
for all $f, f' \in \mathcal{H}$. For simplicity, we assume that $\|h\| = 0$. In that case, the covariance
operator is determined by

$$E[\langle X, f\rangle\langle X, f'\rangle] = \langle f, Sf'\rangle. \tag{1.2}$$

It is well known that $S$ in (1.2) is a trace class operator and therefore admits the
eigenvalue-eigenvector decomposition

$$S = \sum_{j=1}^{\infty} \lambda_j\, \phi_j \otimes \phi_j, \tag{1.3}$$

where $\lambda_1 \ge \lambda_2 \ge \cdots > 0$ are the eigenvalues, $\phi_j$ is the eigenvector associated with
$\lambda_j$ and $(f \otimes g)h = \langle f, h\rangle g$ for $f, g, h \in \mathcal{H}$. A suitably normed version of the
range of $S$ gives us the reproducing kernel Hilbert space

$$\mathcal{H}(S) = \Bigl\{ f: f = \sum_{j=1}^{\infty} \lambda_j f_j \phi_j,\ \|f\|^2_{\mathcal{H}(S)} = \sum_{j=1}^{\infty} \lambda_j f_j^2 < \infty \Bigr\} \tag{1.4}$$

that includes the range of $S$ as a proper subset when $S$ is not finite dimensional which we hereafter
assume to be the case. The reproducing kernel Hilbert space recasts the range of $S$ under
a weaker norm where $S$ is invertible, since the Picard condition (Engl et al. [7])


$$\sum_{j=1}^{\infty} \frac{|\langle f, \phi_j\rangle|^2}{\lambda_j} = \sum_{j=1}^{\infty} \lambda_j f_j^2 < \infty$$


is satisfied for $f \in \mathcal{H}(S)$. For each $f \in \mathcal{H}(S)$ there corresponds a random variable

$$Z(f) = \sum_{j=1}^{\infty} f_j \langle X, \phi_j\rangle.$$

These types of random variables are well defined and include those in the process (1.1)
as a special case. Thus, for inferential purposes we can focus on the Hilbert space

$$L_Z^2 = \Bigl\{ Z(f) = \sum_{j=1}^{\infty} f_j \langle X, \phi_j\rangle:\ \|Z(f)\|^2_{L_Z^2} := \mathrm{Var}(Z(f)) = \sum_{j=1}^{\infty} \lambda_j f_j^2 < \infty \Bigr\} \tag{1.5}$$

which consists of all the linear combinations of the $\langle X, \phi_j\rangle$ that have finite variance.
Note that in addition to serving as an index set, $\mathcal{H}(S)$ is isometrically isomorphic or
congruent to $L_Z^2$, a relationship that will be exploited in the sequel. Parzen [16] calls
$\mathcal{H}(S)$ a congruent reproducing kernel Hilbert space.
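
The congruence between the index space and the process space can be checked numerically in finite dimensions. The following is a minimal sketch, assuming a randomly generated positive definite matrix `S` as a stand-in for the covariance operator (the matrix, the seed and all variable names are illustrative and not part of the development above). It confirms that the squared norm in (1.4) equals the variance of $Z(f)$ in (1.5).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical finite-dimensional stand-in for the covariance operator S.
p = 5
A = rng.standard_normal((p, p))
S = A @ A.T + p * np.eye(p)            # symmetric positive definite

lam, phi = np.linalg.eigh(S)           # eigenvalues lam[j], eigenvectors phi[:, j]

# An element of H(S), written as f = sum_j lam_j * f_j * phi_j.
coef = rng.standard_normal(p)          # the coefficients f_j
f = phi @ (lam * coef)

# Squared RKHS norm from (1.4): sum_j lam_j * f_j^2.
rkhs_norm_sq = np.sum(lam * coef**2)

# Z(f) = sum_j f_j <X, phi_j> = <X, g> with g = sum_j f_j phi_j,
# so Var(Z(f)) = g' S g when X has covariance S.
g = phi @ coef
var_Zf = g @ S @ g

# The same norm written as a quadratic form in f: f' S^{-1} f.
quad_form = f @ np.linalg.solve(S, f)

print(rkhs_norm_sq, var_Zf, quad_form)
```

The three printed numbers agree up to rounding, which is the finite-dimensional version of the isometry between $\mathcal{H}(S)$ and the space in (1.5).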

For functional data, $X$ and the $\phi_j$ are typically functions on some continuous index
set $E$. In that instance it follows from Kupresanin et al. [14] that working with $L_Z^2$ is
equivalent to working with the space spanned by the $X$ process: that is,

$$L_X^2 = \Bigl\{ a = \sum_{j=1}^{n} a_j X(t_j),\ t_j \in E,\ a_j \in \mathbb{R},\ n = 1, 2, \ldots \Bigr\} \tag{1.6}$$

under the inner product $E[ab]$ for $a, b \in L_X^2$. In fact, functional canonical correlation
can be treated directly from this latter perspective using reproducing kernel Hilbert
space techniques along the lines of those employed in Eubank and Hsing [8]. However,
our present formulation in terms of $L_Z^2$ has certain advantages (both mathematical and
computational) and appears to generalize more readily to deal with partial canonical
correlation and related ideas.

Assume now that we have two $\mathcal{H}$-valued random variables $X_i, i = 1, 2$, whose associated
covariance operators $S_i, i = 1, 2$, have the eigenvalue-eigenvector sequences
$\{(\lambda_{ij}, \phi_{ij})\}_{j=1}^{\infty}$ from (1.3). These, in turn, produce Hilbert spaces
$L^2_{Z_i}, i = 1, 2$, defined analogous to (1.5) for processes $Z_i(f_i), i = 1, 2$, that are
indexed by Hilbert spaces $\mathcal{H}(S_i)$ defined as in (1.4).
Then, the (first) canonical correlation between $Z_1$ and $Z_2$ is defined to be

$$\rho^2 = \sup_{\|f_i\|_{\mathcal{H}(S_i)} = 1,\ i = 1, 2} \mathrm{Cov}^2(Z_1(f_1), Z_2(f_2)). \tag{1.7}$$

One can deduce from Eubank and Hsing [8] that (1.7) is well defined with the supremum
being attained. We provide an independent verification of this fact in the next section.
If $f_1, f_2$ are maximizing functions, then $Z_1(f_1), Z_2(f_2)$ are the first canonical variables of
the $Z_1$ and $Z_2$ processes, respectively. Subsequent canonical correlations and variables
can be obtained similar to the first in an iterative process that parallels the one employed
in the standard multivariate analysis case; see, for example, Eubank and Hsing [8].

A number of articles dealing with functional canonical correlation and related concepts
have focused on the case where the $Z_i(f_i)$ are restricted to have

$$\sum_{j=1}^{\infty} f_{ij}^2 < \infty, \quad i = 1, 2, \tag{1.8}$$

which has the consequence that $\sum_{j} f_{ij}\phi_{ij} \in \mathcal{H}$. In such instances the supremum (1.7)
need not be attained as demonstrated in Cupidon et al. [2] and Cupidon et al. [1].
Dauxois and Pousse [6], Dauxois et al. [4], Dauxois and Nkiet [3] and Dauxois et al. [5]
largely ignore this issue with the consequence that their statistical applications become
relevant only for finite dimensional covariance operators whose ranges are necessarily
closed. Such results are, of course, already subsumed by the original Hotelling [12] work.
In contrast, He et al. [11] impose restrictions on the cross-covariances of coefficients in
the two processes' Karhunen-Loeve expansions to ensure that (1.8) is satisfied. Such
restrictions are unnecessary as will be seen in the next section.

In the present paper, we are interested not only in functional CCA but functional
partial canonical correlation, as well. In the case of finite dimensional covariance
operators, the idea was proposed by Roy [17]. Given three random vectors $X_1, X_2$ and $X_3$,
the partial canonical correlation of $X_2$ and $X_3$ relative to $X_1$ was defined as the ordinary
canonical correlation between $\tilde{X}_2 = X_2 - P_{X_1}X_2$ and $\tilde{X}_3 = X_3 - P_{X_1}X_3$, where $P_{X_1}$
denotes projection onto the linear space spanned by $X_1$. Related work by Dauxois and
Nkiet [3] and Dauxois et al. [5] comes with the restriction of a closed range for covariance
operators which, again, confines statistical applications to the finite dimensional
setting that was already treated in Roy's original work. In Section 3, we show how the
partial canonical correlation concept can be rigorously extended to infinite dimensions
and functional data.

In the next section, we set out the main ideas that are needed for rigorous treatment
of canonical correlation and related concepts in the context of Hilbert space indexed
processes of the basic form (1.5). The driving force behind our approach is the isometry
that exists between the $L_Z^2$ and $\mathcal{H}(S)$ spaces. To demonstrate the utility of this analytic
framework, we illustrate the idea with two processes in the next section and extend this
to three processes and partial canonical correlation in Section 3.


2. CCA 

In this section, we begin with the case of two processes and establish the properties of 
canonical correlations and variables as defined in (1.7). Most of the basic techniques that 
are needed for the three process setting of the next section are illustrated in this somewhat 
simpler scenario thereby making it the natural starting point for our exposition. 

As in Section 1, assume that we have two $\mathcal{H}$-valued random variables with associated
covariance operators $S_i, i = 1, 2$, having eigenvalue-eigenvector sequences
$\{(\lambda_{ij}, \phi_{ij})\}_{j=1}^{\infty}$.
From Vakhania et al. [20], it may be concluded that there are also cross-covariance
operators $S_{12}$ and $S_{21}$ defined by, for example,

$$E\langle X_1, f_1\rangle\langle X_2, f_2\rangle = \langle f_1, S_{12} f_2\rangle$$

with $S_{21} = S_{12}^*$ for $S_{12}^*$ the adjoint of $S_{12}$.

Now we construct a new Hilbert space

$$\mathcal{H}_0 = \Bigl\{ h = (f_1, f_2):\ f_i \in \mathcal{H}(S_i),\ i = 1, 2,\ \|h\|_0^2 = \sum_{i=1}^{2} \|f_i\|^2_{\mathcal{H}(S_i)} \Bigr\}$$

from which we obtain the $\mathcal{H}_0$ indexed process

$$Z(h) = Z_1(f_1) + Z_2(f_2)$$

with covariance function

$$\begin{aligned}
\mathrm{Cov}(Z(h), Z(h')) &= \mathrm{Cov}(Z_1(f_1), Z_1(f_1')) + \mathrm{Cov}(Z_2(f_2), Z_2(f_2')) \\
&\quad + \mathrm{Cov}(Z_1(f_1), Z_2(f_2')) + \mathrm{Cov}(Z_1(f_1'), Z_2(f_2)) \\
&= \langle f_1, f_1'\rangle_{\mathcal{H}(S_1)} + \langle f_2, f_2'\rangle_{\mathcal{H}(S_2)} \\
&\quad + \mathrm{Cov}(Z_1(f_1), Z_2(f_2')) + \mathrm{Cov}(Z_1(f_1'), Z_2(f_2)).
\end{aligned} \tag{2.1}$$

In order to avoid the degenerate setting where perfect prediction is possible, we impose
the following condition.

Assumption 2.1. There exist no $(f_1, f_2) \in \mathcal{H}_0$ such that $|\mathrm{Corr}(Z_1(f_1), Z_2(f_2))| = 1$.

The cross-covariance terms in (2.1) can be characterized as deriving from operators
between $\mathcal{H}(S_1)$ and $\mathcal{H}(S_2)$. To see this, define the functional

$$l_{f_2}(f_1) = \mathrm{Cov}(Z_1(f_1), Z_2(f_2))$$

on $\mathcal{H}(S_1)$. Clearly, $l_{f_2}$ is linear since covariance is bilinear and, e.g., $Z_1(af_1 + a'f_1') =
aZ_1(f_1) + a'Z_1(f_1')$ for any scalars $a, a'$ and any $f_1, f_1' \in \mathcal{H}(S_1)$. Also, by the
Cauchy-Schwarz inequality,

$$|l_{f_2}(f_1)| \le \sqrt{\mathrm{Var}\,Z_1(f_1)\,\mathrm{Var}\,Z_2(f_2)} = \|f_1\|_{\mathcal{H}(S_1)} \|f_2\|_{\mathcal{H}(S_2)}.$$

Thus, $l_{f_2}$ is a bounded linear functional on $\mathcal{H}(S_1)$ and by the Riesz representation theorem
there is a bounded operator $C_{12}: \mathcal{H}(S_2) \to \mathcal{H}(S_1)$ satisfying

$$\mathrm{Cov}(Z_1(f_1), Z_2(f_2)) = \langle f_1, C_{12} f_2\rangle_{\mathcal{H}(S_1)}. \tag{2.2}$$

There is also a bounded operator $C_{21}: \mathcal{H}(S_1) \to \mathcal{H}(S_2)$ with $C_{21} = C_{12}^*$, which satisfies

$$\mathrm{Cov}(Z_1(f_1), Z_2(f_2)) = \langle C_{21} f_1, f_2\rangle_{\mathcal{H}(S_2)}.$$

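Proposition 2.1. Under Assumption 2.1, $\|C_{12}\| = \|C_{21}\| < 1$.

Proof. By the definition of the operator norm, we have

$$\|C_{12}\|^2 = \sup_{f_2 \in \mathcal{H}(S_2),\ \|f_2\|_{\mathcal{H}(S_2)} = 1} \|C_{12} f_2\|^2_{\mathcal{H}(S_1)}.$$

An application of the Cauchy-Schwarz inequality produces

$$\begin{aligned}
|\mathrm{Cov}(Z_1(f_1), Z_2(f_2))| &= |\langle f_1, C_{12} f_2\rangle_{\mathcal{H}(S_1)}| \\
&< \sqrt{\mathrm{Var}\,Z_1(f_1)\,\mathrm{Var}\,Z_2(f_2)} \\
&= \|f_1\|_{\mathcal{H}(S_1)} \|f_2\|_{\mathcal{H}(S_2)}
\end{aligned}$$

with the strict inequality coming from Assumption 2.1. Now take $f_1 = C_{12} f_2$. □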

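The operators $C_{12}$ and $S_{12}$ are, of course, related as we now explain. For this purpose,
define

$$\tilde{\mathcal{H}}(S_i) = \Bigl\{ f_i:\ f_i = \sum_{j=1}^{\infty} f_{ij}\phi_{ij},\ \|f_i\|^2_{\tilde{\mathcal{H}}(S_i)} = \sum_{j=1}^{\infty} \lambda_{ij} f_{ij}^2 < \infty \Bigr\}, \quad i = 1, 2.$$

Then, $S_i$ is an isometric mapping from $\tilde{\mathcal{H}}(S_i)$ onto $\mathcal{H}(S_i)$; that is, $\tilde{\mathcal{H}}(S_i) = S_i^{-1}\mathcal{H}(S_i)$.
This leads us to the following lemma.

Lemma 2.1. $S_{12}$ is an operator from $\tilde{\mathcal{H}}(S_2)$ into $\mathcal{H}(S_1)$ with $\|S_{12}\| < 1$.

Proof. For any $f_2 \in \tilde{\mathcal{H}}(S_2)$ and $f_1 \in \mathcal{H}(S_1)$,

$$\begin{aligned}
\mathrm{Cov}(Z_1(f_1), Z_2(S_2 f_2)) &= \sum_{i,j} f_{1i} f_{2j} \langle \phi_{1i}, S_{12}\phi_{2j}\rangle \\
&= \sum_{i,j} \lambda_{1i} f_{1i} f_{2j} \langle \phi_{1i}, S_{12}\phi_{2j}\rangle_{\mathcal{H}(S_1)} \\
&= \langle f_1, S_{12} f_2\rangle_{\mathcal{H}(S_1)}.
\end{aligned}$$

Now use the Cauchy-Schwarz inequality and $\|S_2 f_2\|_{\mathcal{H}(S_2)} = \|f_2\|_{\tilde{\mathcal{H}}(S_2)}$. □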

Lemma 2.1 provides the means to characterize $C_{12}$. Specifically, observe that

$$\begin{aligned}
\mathrm{Cov}(Z_1(f_1), Z_2(S_2 f_2)) &= \langle f_1, S_{12} f_2\rangle_{\mathcal{H}(S_1)} \\
&= \langle f_1, S_{12} S_2^{-1} S_2 f_2\rangle_{\mathcal{H}(S_1)} \\
&= \langle f_1, C_{12} S_2 f_2\rangle_{\mathcal{H}(S_1)}.
\end{aligned}$$

In addition, the fact that $S_{12}$ is compact on $\mathcal{H}$ along with an argument similar to that
of Lemma 2.1 reveals that $C_{12}$ is the limit of a sequence of finite dimensional operators.
We summarize these findings as follows.

Theorem 2.1. $C_{12} = S_{12} S_2^{-1}$ is a compact operator from $\mathcal{H}(S_2)$ into $\mathcal{H}(S_1)$.

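For $h \in \mathcal{H}_0$, define $Qh = (f_1 + C_{12} f_2,\ f_2 + C_{21} f_1)$. It will be convenient to write this in
matrix form as

$$Qh = \begin{bmatrix} I & C_{12} \\ C_{21} & I \end{bmatrix} \begin{bmatrix} f_1 \\ f_2 \end{bmatrix} \tag{2.3}$$

with the convention that the resulting vector is viewed as an element of $\mathcal{H}_0$. Observe
that

$$\mathrm{Cov}(Z(h), Z(h')) = \langle h, Qh'\rangle_0.$$

This leads to the following proposition.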


This leads to the following proposition. 
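Proposition 2.2. $Q$ is invertible with inverse defined by

$$Q^{-1}(h) = \bigl(C_{11.2}^{-1} f_1 - C_{12} C_{22.1}^{-1} f_2,\ C_{22.1}^{-1} f_2 - C_{21} C_{11.2}^{-1} f_1\bigr), \tag{2.4}$$

where $h = (f_1, f_2) \in \mathcal{H}_0$ and $C_{ii.k} = I - C_{ik}C_{ki} = (I - C_{ik}C_{ki})^*$ for $i, k = 1, 2$, $i \ne k$.
Analogous to (2.3), (2.4) will also be expressed as

$$Q^{-1}h = \begin{bmatrix} C_{11.2}^{-1} & -C_{12} C_{22.1}^{-1} \\ -C_{21} C_{11.2}^{-1} & C_{22.1}^{-1} \end{bmatrix} \begin{bmatrix} f_1 \\ f_2 \end{bmatrix}.$$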

Proof. The form of the inverse as stated in (2.4) follows directly once we have shown
all the relevant inverse operators exist. Thus, let us concentrate on the latter task.

We can write $Q = I - T$ with

$$Th = (-C_{12} f_2,\ -C_{21} f_1) = -\begin{bmatrix} 0 & C_{12} \\ C_{21} & 0 \end{bmatrix} \begin{bmatrix} f_1 \\ f_2 \end{bmatrix}.$$

Then

$$\begin{aligned}
\|Th\|_0^2 &\le \|C_{12}\|^2 \|f_2\|^2_{\mathcal{H}(S_2)} + \|C_{21}\|^2 \|f_1\|^2_{\mathcal{H}(S_1)} \\
&= \|C_{12}\|^2 \bigl[\|f_1\|^2_{\mathcal{H}(S_1)} + \|f_2\|^2_{\mathcal{H}(S_2)}\bigr] \\
&= \|C_{12}\|^2 \|h\|_0^2 \\
&< \|h\|_0^2
\end{aligned}$$

by Proposition 2.1. Theorem 4.40 of Rynne and Youngson [18] now has the consequence
that $I - T = Q$ is invertible.

To complete the proof, we need to show that $C_{11.2}$ and $C_{22.1}$ are invertible. This again
follows from Theorem 4.40 of Rynne and Youngson [18] because $C_{11.2} = I - C_{12}C_{21}$ with
$\|C_{21}\| = \|C_{12}\| < 1$ from Proposition 2.1. □
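
The block form of the inverse in (2.4) is easy to sanity check in finite dimensions. The sketch below is only an illustration under assumed inputs: it draws a hypothetical positive definite joint covariance matrix, uses the finite-dimensional identities $C_{12} = S_{12}S_2^{-1}$ and $C_{21} = S_{21}S_1^{-1}$ from Theorem 2.1, and compares the stated inverse with the identity.

```python
import numpy as np

rng = np.random.default_rng(1)
p1, p2 = 3, 4

# Hypothetical joint covariance of (X1, X2); positive definite, so no
# pair of linear combinations is perfectly correlated (Assumption 2.1).
A = rng.standard_normal((p1 + p2, p1 + p2))
Sigma = A @ A.T + np.eye(p1 + p2)
S1, S12 = Sigma[:p1, :p1], Sigma[:p1, p1:]
S21, S2 = Sigma[p1:, :p1], Sigma[p1:, p1:]

# Finite-dimensional versions of the operators C12 and C21.
C12 = S12 @ np.linalg.inv(S2)
C21 = S21 @ np.linalg.inv(S1)

I1, I2 = np.eye(p1), np.eye(p2)
Q = np.block([[I1, C12], [C21, I2]])

# C_{11.2} = I - C12 C21 and C_{22.1} = I - C21 C12.
C11_2_inv = np.linalg.inv(I1 - C12 @ C21)
C22_1_inv = np.linalg.inv(I2 - C21 @ C12)

# Block form of Q^{-1} as in (2.4).
Q_inv_formula = np.block([
    [C11_2_inv, -C12 @ C22_1_inv],
    [-C21 @ C11_2_inv, C22_1_inv],
])

print(np.allclose(Q @ Q_inv_formula, np.eye(p1 + p2)))
```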


Now define

$$\mathcal{H}(Q) = \Bigl\{ h:\ h = Q\begin{bmatrix} f_1 \\ f_2 \end{bmatrix},\ f_i \in \mathcal{H}(S_i),\ i = 1, 2,\ \|h\|^2_{\mathcal{H}(Q)} = \|Q^{-1/2}h\|_0^2 < \infty \Bigr\}.$$

The next proposition follows immediately from this definition.

Proposition 2.3. $\mathcal{H}(Q)$ is congruent to

$$L_Z^2 = \bigl\{ Z(h):\ h \in \mathcal{H}_0,\ \|Z(h)\|^2_{L_Z^2} := \mathrm{Var}(Z(h)) < \infty \bigr\}$$

under the mapping $\Psi(h) = Z(Q^{-1}h)$.

With Proposition 2.3 in hand we can now give our formulation of CCA. Specifically, we
seek elements $f_i \in \mathcal{H}(S_i)$ of unit norm that maximize $|\mathrm{Cov}(Z_1(f_1), Z_2(f_2))|$. But

$$\mathrm{Cov}(Z_1(f_1), Z_2(f_2)) = \mathrm{Cov}(Z(f_1, 0), Z(0, f_2)) = \Bigl\langle Q\begin{bmatrix} f_1 \\ 0 \end{bmatrix},\ Q\begin{bmatrix} 0 \\ f_2 \end{bmatrix} \Bigr\rangle_{\mathcal{H}(Q)}$$

which leads to the conclusion that it is equivalent to find $f_i \in \mathcal{H}(S_i)$ to maximize the
right-hand side of this last expression.

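The analysis from this point is driven by the results of Sunder [19] as described in
the Technical Appendix. For that purpose, we decompose $\mathcal{H}(Q)$ into a sum of the closed
subspaces $M_1$ and $M_2$ with

$$M_1 = \Bigl\{ h \in \mathcal{H}(Q):\ h = Q\begin{bmatrix} f_1 \\ 0 \end{bmatrix} = (f_1, C_{21} f_1),\ f_1 \in \mathcal{H}(S_1) \Bigr\}$$

and

$$M_2 = \Bigl\{ h \in \mathcal{H}(Q):\ h = Q\begin{bmatrix} 0 \\ f_2 \end{bmatrix} = (C_{12} f_2, f_2),\ f_2 \in \mathcal{H}(S_2) \Bigr\}.$$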

Regarding $M_1$ and $M_2$, we have the following result.

Proposition 2.4. $\mathcal{H}(Q) = M_1 + M_2$ with "+" indicating an algebraic direct sum.

Proof. Clearly any element of $\mathcal{H}_0$ can be written as the sum of elements in $M_1$ and
$M_2$. We therefore need only show that $M_1 \cap M_2 = \{0\}$. Thus, suppose there exist
$f_i \in \mathcal{H}(S_i), i = 1, 2$, such that $(f_1, C_{21} f_1) = (C_{12} f_2, f_2)$. Then

$$\mathrm{Var}(Z_1(f_1)) = \langle f_1, f_1\rangle_{\mathcal{H}(S_1)} = \langle f_1, C_{12} f_2\rangle_{\mathcal{H}(S_1)}$$

and

$$\mathrm{Var}(Z_2(f_2)) = \langle f_2, f_2\rangle_{\mathcal{H}(S_2)} = \langle f_2, C_{21} f_1\rangle_{\mathcal{H}(S_2)} = \langle C_{12} f_2, f_1\rangle_{\mathcal{H}(S_1)}.$$

But, these relations have the consequence that $|\mathrm{Corr}(Z_1(f_1), Z_2(f_2))| = 1$ which
contradicts Assumption 2.1. □

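To relate Proposition 2.4 to Sunder's scheme in the Appendix, let $L_1 = M_1$ and
$L_2 = M_2 \cap M_1^{\perp}$ in Theorem A.1. Then, for $h_1 = Q\begin{bmatrix} f_1 \\ 0 \end{bmatrix} \in M_1$ and
$h_2 = Q\begin{bmatrix} 0 \\ f_2 \end{bmatrix} \in M_2$, with $\tilde{h}_2 = P_{L_2|M_2}h_2$ the $L_2$ component of $h_2$ from
Theorem A.1, the first canonical correlation satisfies

$$\begin{aligned}
\rho &= \sup_{\|h_i\|_{\mathcal{H}(Q)} = 1,\ i = 1, 2} |\langle h_1, h_2\rangle_{\mathcal{H}(Q)}|
= \sup_{\|h_1\|_{\mathcal{H}(Q)} = 1,\ \|\tilde{h}_2 + B\tilde{h}_2\|_{\mathcal{H}(Q)} = 1} |\langle h_1, B\tilde{h}_2\rangle_{\mathcal{H}(Q)}| \\
&\le \sup_{\tilde{h}_2 \in L_2,\ \|\tilde{h}_2 + B\tilde{h}_2\|_{\mathcal{H}(Q)} = 1} \|B\tilde{h}_2\|_{\mathcal{H}(Q)}
\end{aligned}$$

for $B = P_{L_1|M_2}(P_{L_2|M_2})^{-1}$. Taking $h_1 = B\tilde{h}_2/\|B\tilde{h}_2\|_{\mathcal{H}(Q)}$, we see that the bound is
attainable and holds with equality. Thus, we have shown that $\rho$ is obtained by maximizing
$\|B\tilde{h}_2\|_{\mathcal{H}(Q)}$ subject to

$$\|B\tilde{h}_2 + \tilde{h}_2\|^2_{\mathcal{H}(Q)} = \langle \tilde{h}_2, (I + B^*B)\tilde{h}_2\rangle_{\mathcal{H}(Q)} = 1.$$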

The operator $B^*B$ is compact as a result of Theorem 2.1 and Theorem 2.2 below. In
addition, $I + B^*B$ is self-adjoint, positive, invertible and has a self-adjoint square-root
$(I + B^*B)^{1/2}$. We can therefore work with $\bar{h}_2 = (I + B^*B)^{1/2}\tilde{h}_2$, for $\tilde{h}_2 \in L_2$, and maximize

$$\|B\tilde{h}_2\|_{\mathcal{H}(Q)} = \|B(I + B^*B)^{-1/2}\bar{h}_2\|_{\mathcal{H}(Q)}$$

subject to $\bar{h}_2 \in L_2$ and $\|\bar{h}_2\|_{\mathcal{H}(Q)} = 1$. The maximizer is the eigenvector for the largest
eigenvalue of $(I + B^*B)^{-1/2}B^*B(I + B^*B)^{-1/2}$. Some algebra reveals that the resulting
eigenvalue problem is equivalent to finding a vector $\tilde{h}_2 \in L_2$ with $\|\tilde{h}_2\|_{\mathcal{H}(Q)} = 1$ such that

$$B^*B\tilde{h}_2 = \alpha^2\tilde{h}_2 \tag{2.5}$$

in which case $\rho = \alpha/\sqrt{1 + \alpha^2}$.

Now suppose that $\tilde{h}_2 \in L_2$ is any vector that satisfies (2.5). Its $M_1$ component is
$B\tilde{h}_2$ and its $M_2$ component is $B\tilde{h}_2 + \tilde{h}_2$. These correspond to the canonical variables
$\Psi(B\tilde{h}_2/\alpha)$ and $\Psi((\tilde{h}_2 + B\tilde{h}_2)/\sqrt{1 + \alpha^2})$ of the $Z_1$ and $Z_2$ spaces, respectively.
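
The algebra linking the transformed eigenvalue problem to (2.5) can be spelled out as follows. Here $\mu$ and $u$ are symbols introduced only for this illustration: $\mu$ is an eigenvalue of $(I + B^*B)^{-1/2}B^*B(I + B^*B)^{-1/2}$ with eigenvector $v$, and $u = (I + B^*B)^{-1/2}v$. The steps use only the fact that $B^*B$ commutes with functions of itself.

```latex
\begin{aligned}
(I + B^*B)^{-1/2} B^*B (I + B^*B)^{-1/2} v = \mu v
  &\iff B^*B\,u = \mu\,(I + B^*B)\,u \\
  &\iff (1 - \mu)\,B^*B\,u = \mu\,u \\
  &\iff B^*B\,u = \alpha^2 u, \qquad \alpha^2 = \frac{\mu}{1 - \mu}, \\
\rho^2 = \mu &= \frac{\alpha^2}{1 + \alpha^2}
  \quad\Longrightarrow\quad \rho = \frac{\alpha}{\sqrt{1 + \alpha^2}} .
\end{aligned}
```

Since $\mu \mapsto \mu/(1 - \mu)$ is increasing on $[0, 1)$, the largest $\mu$ corresponds to the largest $\alpha^2$.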

In combination Corollaries A.2 and A.4 from the Appendix give us the desired
characterization for $B^*B$: namely,

Theorem 2.2. For $h = (0, f_2) \in L_2$, $B^*B(0, f_2) = (0, C_{21}C_{12}C_{22.1}^{-1} f_2)$.

An application of Proposition A.1 from the Appendix now reveals that the conclusion
of Theorem 2.2 can be restated as $B^*B(0, C_{22.1} f_2) = (0, C_{21}C_{12} f_2)$ for some $f_2 \in \mathcal{H}(S_2)$ and
the eigenvalue problem (2.5) is equivalent to $C_{21}C_{12} f_2 = \alpha^2 C_{22.1} f_2$ or

$$C_{21}C_{12} f_2 = \rho^2 f_2.$$

By interchanging the roles of $M_1$ and $M_2$ it follows that the optimal choice for $f_1$ is the
eigenvector corresponding to the same eigenvalue $\rho^2$ of $C_{12}C_{21}$. Thus, $\rho$ is the largest
singular value of $C_{21}$, $f_1, f_2$ are its right and left hand singular functions and $Z_1(f_1), Z_2(f_2)$
are the corresponding canonical variables. More generally, a similar analysis reveals that
the collection of all such singular values gives rise to a sequence of canonical correlations
that correspond to canonical variable pairs with maximum possible correlation subject
to being uncorrelated with previous pairs in the sequence.

We conclude this section with examples that illustrate some of the features of our CCA 
formulation. 

Example 2.1. Suppose that $S_1$ and $S_2$ are full-rank, finite-dimensional matrices. Then,
$C_{12} = S_{12}S_2^{-1}$ and $C_{21} = S_{21}S_1^{-1}$ so that finding eigenvalues and eigenvectors for $C_{21}C_{12}$
is equivalent to the singular value decomposition of $S_1^{-1/2}S_{12}S_2^{-1/2}$ which, in turn, is
equivalent to Hotelling's classic solution for the finite dimensional case as established in
Kshirsagar [13].
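
The equivalence in Example 2.1 can be illustrated numerically. The following sketch uses a hypothetical randomly generated joint covariance matrix (the helper `inv_sqrt`, the dimensions and the seed are illustrative assumptions) and checks that the nonzero eigenvalues of $C_{21}C_{12}$ coincide with the squared singular values of $S_1^{-1/2}S_{12}S_2^{-1/2}$.

```python
import numpy as np

def inv_sqrt(S):
    """Inverse symmetric square root of a symmetric positive definite matrix."""
    lam, V = np.linalg.eigh(S)
    return (V / np.sqrt(lam)) @ V.T

rng = np.random.default_rng(2)
p1, p2 = 3, 4

# Hypothetical full-rank joint covariance matrix of (X1, X2).
A = rng.standard_normal((p1 + p2, p1 + p2))
Sigma = A @ A.T + np.eye(p1 + p2)
S1, S12 = Sigma[:p1, :p1], Sigma[:p1, p1:]
S21, S2 = Sigma[p1:, :p1], Sigma[p1:, p1:]

C12 = S12 @ np.linalg.inv(S2)
C21 = S21 @ np.linalg.inv(S1)

# Eigenvalues of C21 C12 (a p2 x p2 matrix of rank at most p1), sorted downwards.
eig_C = np.sort(np.linalg.eigvals(C21 @ C12).real)[::-1]

# Hotelling's route: singular values of S1^{-1/2} S12 S2^{-1/2}.
sv = np.linalg.svd(inv_sqrt(S1) @ S12 @ inv_sqrt(S2), compute_uv=False)

print(eig_C[:p1])
print(sv**2)
```

The two printed vectors agree up to rounding, and the remaining eigenvalue of $C_{21}C_{12}$ is numerically zero because $p_2 > p_1$ in this toy setting.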

Example 2.2. Functional data analysis generally focuses on the case where the $X_i$
are random elements of $L^2[0, 1]$; that is, the set of square integrable functions on the
interval $[0, 1]$. One assumes the $X_i$ admit point-wise representations as the continuous
time stochastic processes $\{X_i(t, \omega):\ t \in [0, 1],\ \omega \in \Omega\}, i = 1, 2$. Inference is then based on
the linear combinations described in (1.6).

The (assumed continuous) process covariance kernels are

$$K_i(t, t') = \mathrm{Cov}(X_i(t), X_i(t')) = \sum_{j=1}^{\infty} \lambda_{ij}\phi_{ij}(t)\phi_{ij}(t')$$

with the $(\lambda_{ij}, \phi_{ij}), j = 1, \ldots, i = 1, 2$, being the eigenvalues and eigenvectors of the $L^2[0, 1]$
integral operators defined by

$$(S_i f)(t) = \int_0^1 f(s)K_i(t, s)\,ds.$$

The RKHS that is congruent to $L^2_{X_i}$ is $\mathcal{H}(S_i)$.

In the case of two processes, we also have the cross-covariance kernels

$$\begin{aligned}
K_{12}(t_1, t_2) &= \mathrm{Cov}(X_1(t_1), X_2(t_2)) \\
&= \mathrm{Cov}(X_2(t_2), X_1(t_1)) \\
&= K_{21}(t_2, t_1).
\end{aligned}$$

From Eubank and Hsing [8], we know that $K_{12}(\cdot, t_2) \in \mathcal{H}(S_1)$, and $K_{12}(t_1, \cdot) \in \mathcal{H}(S_2)$; so,
if $f_i = \sum_{j=1}^{\infty} \lambda_{ij} f_{ij}\phi_{ij} \in \mathcal{H}(S_i)$,

$$(R_{12} f_2)(t) = \langle K_{12}(t, \cdot), f_2(\cdot)\rangle_{\mathcal{H}(S_2)}$$

defines a bounded operator from $\mathcal{H}(S_2)$ into $\mathcal{H}(S_1)$ with the property that

$$\begin{aligned}
\mathrm{Cov}(Z_1(f_1), Z_2(f_2)) &= \sum_k \sum_j f_{1j} f_{2k} \int_0^1\!\!\int_0^1 K_{12}(s, t)\phi_{1j}(s)\phi_{2k}(t)\,ds\,dt \\
&= \langle f_1, R_{12} f_2\rangle_{\mathcal{H}(S_1)}.
\end{aligned}$$

Therefore, $R_{12} = C_{12}$ and our CCA formulation coincides with that in Eubank and Hsing [8].


Example 2.3. The developments in this section suggest a new approach to estimation
in the functional CCA setting of the previous example. The idea stems from (2.2) which
has the consequence that

$$\mathrm{Cov}(Z_1(\phi_{1i}), Z_2(\phi_{2j})) = \langle \phi_{1i}, C_{12}\phi_{2j}\rangle_{\mathcal{H}(S_1)}. \tag{2.6}$$

It follows from Hansen [10] that a singular value decomposition of

$$A_m = \{\langle \phi_{1i}, C_{12}\phi_{2j}\rangle_{\mathcal{H}(S_1)}\}_{i,j = 1:m} \tag{2.7}$$

for some finite integer $m$ will produce singular values that approximate the singular
values for the operator $C_{12}$ and that the singular vectors provide coefficients for linear
combinations of the $\phi_{ij}$ that approximate its singular functions. The only question is how
to estimate the inner products in (2.7). The answer is revealed by examining the left hand
side of (2.6). The realized values of the $Z_i(\phi_{ij}), j = 1, \ldots, m$ can be estimated directly using
the scores one obtains from a principal components analysis of functional data. Thus,
their sample covariance matrix provides an obvious choice for an estimator of (2.7).

Suppose we have observed sample path pairs $(x_{1j}(\cdot), x_{2j}(\cdot)), j = 1, \ldots, n$. The resulting
estimation algorithm can then be summarized as follows.

1. Carry out a principal components analysis of the $x_{ij}, j = 1, \ldots, n$ to obtain the
estimated eigenfunctions $\hat{\phi}_{ij}, j = 1, \ldots, m$ and $n \times m$ score matrices

$$W_i = \{\langle \hat{\phi}_{ij}, x_{ik}(\cdot)\rangle\}_{k = 1:n,\ j = 1:m}$$

for $i = 1, 2$. Let $\hat{A}_m$ be the $m \times m$ sample cross covariance matrix obtained from
$W_1$ and $W_2$.

2. If $\hat{A}_m = UDV^T$ for $U = [u_1, \ldots, u_m]$, $V = [v_1, \ldots, v_m]$ and $D = \mathrm{diag}(d_1, \ldots, d_m)$ is
the singular value decomposition of $\hat{A}_m$, the $i$th canonical correlation is estimated
by $d_i$ and the corresponding canonical weight functions by $u_i^T[\hat{\phi}_{11}, \ldots, \hat{\phi}_{1m}]$ and
$v_i^T[\hat{\phi}_{21}, \ldots, \hat{\phi}_{2m}]$, with rows of $\hat{A}_m$ indexed by the first process as in (2.7).
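
The two steps above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions rather than the implementation used for the numerical results: curves are taken to be observed on a common grid, the principal components analysis is done by a plain singular value decomposition of the centered data matrix, and the scores are rescaled to unit sample variance so that the singular values are on the correlation scale (that rescaling is our assumption). The function names `pca_scores` and `sample_cca` and the toy data model are hypothetical.

```python
import numpy as np

def pca_scores(X, m):
    """First m principal components of the rows of X (n curves on a grid).

    Returns scores rescaled to unit sample variance and the m discretized
    eigenfunctions (rows of the second output).
    """
    Xc = X - X.mean(axis=0)
    U, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    n = X.shape[0]
    return np.sqrt(n - 1) * U[:, :m], Vt[:m]

def sample_cca(X1, X2, m):
    """Steps 1 and 2: SVD of the sample cross covariance of the scores."""
    W1, phi1 = pca_scores(X1, m)
    W2, phi2 = pca_scores(X2, m)
    n = X1.shape[0]
    A_m = W1.T @ W2 / (n - 1)          # rows indexed by process 1
    U, d, Vt = np.linalg.svd(A_m)
    # Row i of each output is the i-th weight function on the grid.
    return d, U.T @ phi1, Vt @ phi2

# Toy data (hypothetical model): decaying score variances, with the first
# scores of the two processes correlated at 1/sqrt(2).
rng = np.random.default_rng(3)
n, T, J = 2000, 100, 4
t = np.linspace(0.0, 1.0, T)
basis = np.sqrt(2.0) * np.sin(np.pi * np.outer(np.arange(1, J + 1), t))
scale = 1.0 / np.arange(1, J + 1)
Z1 = rng.standard_normal((n, J))
Z2 = rng.standard_normal((n, J))
xi1 = Z1 * scale
xi2 = Z2 * scale
xi2[:, 0] = (Z1[:, 0] + Z2[:, 0]) / np.sqrt(2.0)
X1, X2 = xi1 @ basis, xi2 @ basis

d, w1, w2 = sample_cca(X1, X2, m=3)
print(d)
```

In this toy model only the leading estimated canonical correlation should be far from zero, with a value near $1/\sqrt{2}$.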

A simple numerical example will be used to illustrate this estimation scheme. The 
setting is that of Eubank and Hsing [8] where the two processes are 

20 

Xy) = ZijV2sm{jnt), 

i=i 

20 

X2{t) = (Zii +Z2i)sin(7tf) + y^j~^/^Z2jV2sin(j7Ts), 

1=2 

for $t \in [0, 1]$ and the $Z_{ij}$ i.i.d. standard normal random variables. In this instance, there
is only one nonzero canonical correlation: namely, $\rho_1 = 1/\sqrt{2} = 0.707$.

We sampled $n$ process pairs at 100 equally spaced points and conducted principal
components analysis on the resulting data using the function pca.fd from the fda package
in R retaining 9 components (or harmonics) for both processes. This basic experiment
was then replicated 100 times. For samples of size $n = 250$, the observed means (standard
deviations) of the first two sample canonical correlations were 0.7248 (0.0818) and 0.0777
(0.0122), respectively. For samples of size $n = 500$, the means (standard deviations) were
0.7147 (0.0591) and 0.055 (0.0095).

This rather crude implementation suffices for the present expository purposes. However,
for use in practice one should at least employ consistent estimators for the
eigenfunctions such as those studied in Yao et al. [21] and Hall et al. [9].


3. PCCA 

A similar approach to that of the previous section can be used to address the PCCA
setting. There are now three $\mathcal{H}$-valued random variables $X_i, i = 1, 2, 3$, with associated
covariance operators $S_i, i = 1, 2, 3$. As in Section 2, we can also define the cross-covariance
operators $S_{12}, S_{13}, S_{23}$ and their adjoints.

For $i = 1, 2, 3$, the Hilbert spaces $L^2_{Z_i}$ spanned by the process $Z_i(f_i)$ indexed by their
congruent Hilbert spaces $\mathcal{H}(S_i)$ are defined as in (1.5) and (1.4). Hence, by the Riesz
representation theorem, there are bounded operators $C_{ij}: \mathcal{H}(S_j) \to \mathcal{H}(S_i)$ satisfying

$$\mathrm{Cov}(Z_i(f_i), Z_j(f_j)) = \langle f_i, C_{ij} f_j\rangle_{\mathcal{H}(S_i)}$$

for $i, j = 1, 2, 3$ and $i \ne j$. Also, we have that $C_{ij} = C_{ji}^*$.

We now construct the new Hilbert space

$$\mathcal{H}_0 = \Bigl\{ h = (f_1, f_2, f_3):\ f_i \in \mathcal{H}(S_i),\ i = 1, 2, 3,\ \|h\|_0^2 = \sum_{i=1}^{3} \|f_i\|^2_{\mathcal{H}(S_i)} < \infty \Bigr\}.$$

Then, our corresponding $\mathcal{H}_0$ indexed process is $Z(h) = \sum_{i=1}^{3} Z_i(f_i)$.

As in the previous section we need to rule out the case where perfect prediction is
possible. For this purpose, we require that Assumption 2.1 holds for both of the process
pairs $Z_1, Z_2$ and $Z_1, Z_3$ as well as the following.

Assumption 3.1. There exist no $f_2 \in \mathcal{H}(S_2)$ or $f_3 \in \mathcal{H}(S_3)$ such that
$|\mathrm{Corr}(Z_2(f_2) - P_{Z_1}Z_2(f_2),\ Z_3(f_3) - P_{Z_1}Z_3(f_3))| = 1$.
For $h \in \mathcal{H}_0$, define

$$Qh = (f_1 + C_{12} f_2 + C_{13} f_3,\ C_{21} f_1 + f_2 + C_{23} f_3,\ C_{31} f_1 + C_{32} f_2 + f_3)$$

which we will express in the matrix form

$$Qh = \begin{bmatrix} I & C_{12} & C_{13} \\ C_{21} & I & C_{23} \\ C_{31} & C_{32} & I \end{bmatrix} \begin{bmatrix} f_1 \\ f_2 \\ f_3 \end{bmatrix}.$$

We then see that

$$\mathrm{Cov}(Z(h), Z(h')) = \langle h, Qh'\rangle_0.$$

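Our next result gives the three process parallel of Proposition 2.2.

Proposition 3.1. Let $E = [C_{12}\ \ C_{13}]$, $F = \begin{bmatrix} C_{21} \\ C_{31} \end{bmatrix}$,
$D = \begin{bmatrix} C_{22.1} & 0 \\ 0 & C_{33.1} \end{bmatrix}$ and $G = D^{1/2}(I - V)D^{1/2}$ with

$$V = \begin{bmatrix} 0 & -C_{22.1}^{-1/2}(C_{23} - C_{21}C_{13})C_{33.1}^{-1/2} \\ -C_{33.1}^{-1/2}(C_{32} - C_{31}C_{12})C_{22.1}^{-1/2} & 0 \end{bmatrix}. \tag{3.1}$$

Then,

$$Q^{-1} = \begin{bmatrix} I + EG^{-1}F & -EG^{-1} \\ -G^{-1}F & G^{-1} \end{bmatrix}. \tag{3.2}$$

From Proposition 2.2, we know that $C_{22.1}$ and $C_{33.1}$ are invertible. The result will
therefore follow if we can show that the norm of $V$ in (3.1) is strictly less than unity.
This is a consequence of the next two lemmas and Theorem 4.40 of Rynne and Youngson
[18].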
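
As with (2.4), the block inverse in (3.2) can be checked in finite dimensions. The sketch below is an illustration with a hypothetical random joint covariance matrix; it forms $G$ through the expansion $G = D^{1/2}(I - V)D^{1/2} = D + \text{(off-diagonal blocks)}$, which equals the Schur complement of the upper left identity block of $Q$, so no operator square roots are needed. The helper `C(i, j)` and the dimensions are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(5)
p = (2, 3, 3)                          # hypothetical dimensions of X1, X2, X3
P = sum(p)
A = rng.standard_normal((P, P))
Sigma = A @ A.T + np.eye(P)            # positive definite joint covariance
idx = np.split(np.arange(P), np.cumsum(p)[:-1])

def C(i, j):
    """Finite-dimensional C_ij = S_ij S_j^{-1}, with 1-based indices as in the text."""
    Sij = Sigma[np.ix_(idx[i - 1], idx[j - 1])]
    Sjj = Sigma[np.ix_(idx[j - 1], idx[j - 1])]
    return Sij @ np.linalg.inv(Sjj)

I1, I2, I3 = (np.eye(k) for k in p)
Q = np.block([[I1, C(1, 2), C(1, 3)],
              [C(2, 1), I2, C(2, 3)],
              [C(3, 1), C(3, 2), I3]])

E = np.hstack([C(1, 2), C(1, 3)])
F = np.vstack([C(2, 1), C(3, 1)])
Q22 = np.block([[I2, C(2, 3)], [C(3, 2), I3]])

# G expanded: diagonal blocks C_{22.1}, C_{33.1}; off-diagonal blocks
# C23 - C21 C13 and C32 - C31 C12.
G = Q22 - F @ E
G_inv = np.linalg.inv(G)

Q_inv_formula = np.block([[I1 + E @ G_inv @ F, -E @ G_inv],
                          [-G_inv @ F, G_inv]])

print(np.allclose(Q @ Q_inv_formula, np.eye(P)))
```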

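Lemma 3.1. The projection of $Z_2(f_2)$ onto $L^2_{Z_1}$ is $Z_1(C_{12} f_2)$ and the projection of
$Z_3(f_3)$ onto $L^2_{Z_1}$ is $Z_1(C_{13} f_3)$.

Proof. If $P_{Z_1}Z_2(f_2)$ denotes the projection, it must satisfy

$$\mathrm{Cov}(Z_1(f_1), P_{Z_1}Z_2(f_2)) = \mathrm{Cov}(Z_1(f_1), Z_2(f_2))$$

for every $f_1 \in \mathcal{H}(S_1)$. Since there is some $\tilde{f}_1 \in \mathcal{H}(S_1)$ such that $P_{Z_1}Z_2(f_2) = Z_1(\tilde{f}_1)$,

$$\begin{aligned}
\mathrm{Cov}(Z_1(f_1), Z_2(f_2)) &= \langle f_1, C_{12} f_2\rangle_{\mathcal{H}(S_1)} \\
&= \mathrm{Cov}(Z_1(f_1), Z_1(\tilde{f}_1)) \\
&= \langle f_1, \tilde{f}_1\rangle_{\mathcal{H}(S_1)}.
\end{aligned}$$

Therefore, $\tilde{f}_1 = C_{12} f_2$. The second half of the lemma is proved similarly. □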

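Lemma 3.2. $\|C_{22.1}^{-1/2}(C_{23} - C_{21}C_{13})C_{33.1}^{-1/2}\| < 1$.

Proof. First, observe that by Lemma 3.1 and Assumption 3.1

$$\begin{aligned}
&|\mathrm{Cov}(Z_2(f_2) - Z_1(C_{12} f_2),\ Z_3(f_3) - Z_1(C_{13} f_3))| \\
&\quad = |\langle f_2, C_{23} f_3\rangle_{\mathcal{H}(S_2)} - \langle f_2, C_{21}C_{13} f_3\rangle_{\mathcal{H}(S_2)}| \\
&\quad < (\mathrm{Var}(Z_2(f_2) - Z_1(C_{12} f_2)))^{1/2}(\mathrm{Var}(Z_3(f_3) - Z_1(C_{13} f_3)))^{1/2} \\
&\quad = \langle f_2, C_{22.1} f_2\rangle^{1/2}_{\mathcal{H}(S_2)}\langle f_3, C_{33.1} f_3\rangle^{1/2}_{\mathcal{H}(S_3)} \\
&\quad = \|C_{22.1}^{1/2} f_2\|_{\mathcal{H}(S_2)}\|C_{33.1}^{1/2} f_3\|_{\mathcal{H}(S_3)}.
\end{aligned}$$

Now, let $\tilde{f}_2 = C_{22.1}^{1/2} f_2$ and $\tilde{f}_3 = C_{33.1}^{1/2} f_3$ to obtain

$$|\langle \tilde{f}_2,\ C_{22.1}^{-1/2}(C_{23} - C_{21}C_{13})C_{33.1}^{-1/2}\tilde{f}_3\rangle_{\mathcal{H}(S_2)}| < \|\tilde{f}_2\|_{\mathcal{H}(S_2)}\|\tilde{f}_3\|_{\mathcal{H}(S_3)}.$$

Finally, taking $\tilde{f}_2 = C_{22.1}^{-1/2}(C_{23} - C_{21}C_{13})C_{33.1}^{-1/2}\tilde{f}_3$ completes the proof. □

Now define

$$\mathcal{H}(Q) = \Bigl\{ h:\ h = Q\begin{bmatrix} f_1 \\ f_2 \\ f_3 \end{bmatrix},\ f_i \in \mathcal{H}(S_i),\ i = 1, 2, 3,\ \|h\|^2_{\mathcal{H}(Q)} = \|Q^{-1/2}h\|_0^2 < \infty \Bigr\}.$$

Then, as in Proposition 2.3, we have

Proposition 3.2. $\mathcal{H}(Q)$ is congruent to

$$L_Z^2 = \bigl\{ Z(h):\ h \in \mathcal{H}_0,\ \|Z(h)\|^2_{L_Z^2} = \mathrm{Var}(Z(h)) < \infty \bigr\}$$

under the mapping $\Psi(h) = Z(Q^{-1}h)$.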

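For the PCCA formulation, we wish to find $f_2 \in \mathcal{H}(S_2)$ and $f_3 \in \mathcal{H}(S_3)$ to maximize

$$|\mathrm{Cov}(Z_2(f_2) - Z_1(C_{12} f_2),\ Z_3(f_3) - Z_1(C_{13} f_3))|.$$

Since

$$\mathrm{Cov}(Z_2(f_2) - Z_1(C_{12} f_2),\ Z_3(f_3) - Z_1(C_{13} f_3)) = \mathrm{Cov}(Z(-C_{12} f_2, f_2, 0),\ Z(-C_{13} f_3, 0, f_3)),$$

it suffices to find $f_2 \in \mathcal{H}(S_2)$ and $f_3 \in \mathcal{H}(S_3)$ to maximize

$$\Bigl|\Bigl\langle Q\begin{bmatrix} -C_{12} f_2 \\ f_2 \\ 0 \end{bmatrix},\ Q\begin{bmatrix} -C_{13} f_3 \\ 0 \\ f_3 \end{bmatrix} \Bigr\rangle_{\mathcal{H}(Q)}\Bigr|.$$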

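Again, we apply the results of Sunder described in the Technical Appendix. For this purpose, write
$\mathcal{H}(Q) = M_1 + M_2 + M_3$ with

$$M_1 = \Bigl\{ h \in \mathcal{H}(Q):\ h = Q\begin{bmatrix} f_1 \\ 0 \\ 0 \end{bmatrix} = (f_1, C_{21} f_1, C_{31} f_1) \Bigr\},$$

$$M_2 = \Bigl\{ h \in \mathcal{H}(Q):\ h = Q\begin{bmatrix} 0 \\ f_2 \\ 0 \end{bmatrix} = (C_{12} f_2, f_2, C_{32} f_2) \Bigr\}$$

and

$$M_3 = \Bigl\{ h \in \mathcal{H}(Q):\ h = Q\begin{bmatrix} 0 \\ 0 \\ f_3 \end{bmatrix} = (C_{13} f_3, C_{23} f_3, f_3) \Bigr\}.$$

An argument similar to that for Proposition 2.4 produces the following proposition.

Proposition 3.3. $\mathcal{H}(Q) = M_1 + M_2 + M_3$ with "+" indicating an algebraic direct sum.

Now let $L_1 = M_1$, $L_2 = M_2 \cap M_1^{\perp}$, $L_3 = M_3 \cap M_1^{\perp} \cap M_2^{\perp}$ and take

$$h_2 = Q\begin{bmatrix} -C_{12} f_2 \\ f_2 \\ 0 \end{bmatrix} \in M_2 - P_{L_1}M_2
\quad\text{and}\quad
h_3 = Q\begin{bmatrix} -C_{13} f_3 \\ 0 \\ f_3 \end{bmatrix} \in M_3 - P_{L_1}M_3$$

with $\|h_i\|_{\mathcal{H}(Q)} = 1, i = 2, 3$. Then, arguing as in the previous section we see that the first
partial canonical correlation can be characterized as

$$\begin{aligned}
\rho &= \sup_{h_2 \in M_2 - P_{L_1}M_2,\ h_3 \in M_3 - P_{L_1}M_3,\ \|h_i\|_{\mathcal{H}(Q)} = 1,\ i = 2, 3} |\langle h_2, h_3\rangle_{\mathcal{H}(Q)}| \\
&= \sup_{\tilde{h}_3 \in L_3,\ \|\tilde{h}_3 + B\tilde{h}_3\|_{\mathcal{H}(Q)} = 1} \|B\tilde{h}_3\|_{\mathcal{H}(Q)}
\end{aligned}$$

for $B = P_{L_2|M_3}(P_{L_3|M_3})^{-1}$. The bound is attained by taking $h_2 = B\tilde{h}_3/\|B\tilde{h}_3\|_{\mathcal{H}(Q)}$ in
which case the first partial canonical correlation is $\alpha/\sqrt{1 + \alpha^2}$ with $\alpha^2$ the largest
eigenvalue of $B^*B$. If $\tilde{h}_3$ is an eigenvector corresponding to $\alpha^2$, the partial canonical variable
for the $Z_2$ space is $\Psi(B\tilde{h}_3/\alpha)$ and the partial canonical variable for the $Z_3$ space is
$\Psi((\tilde{h}_3 + B\tilde{h}_3)/\sqrt{1 + \alpha^2})$.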

Now, through Corollaries A.6 and A.7, we finally obtain

Theorem 3.1. For $h = (0, 0, f_3) \in L_3$,

$$B^*Bh = \bigl(0,\ 0,\ (C_{32} - C_{31}C_{12})C_{22.1}^{-1}(C_{23} - C_{21}C_{13})C_0^{-1} f_3\bigr).$$

This result in combination with Corollary A.5 reveals that partial canonical
correlations are the singular values of the operator $C_{33.1}^{-1/2}(C_{32} - C_{31}C_{12})C_{22.1}^{-1/2}$.
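
In finite dimensions this characterization can be compared with the classical residual-based definition of Roy [17]. The sketch below is an illustration under assumed inputs: a hypothetical random joint covariance matrix, $C_{ij} = S_{ij}S_j^{-1}$ as in Example 2.1, and eigenvalues of the matrix product $C_{33.1}^{-1}(C_{32} - C_{31}C_{12})C_{22.1}^{-1}(C_{23} - C_{21}C_{13})$ used in place of operator singular values, since those eigenvalues do not depend on the choice of inner product.

```python
import numpy as np

def inv_sqrt(S):
    """Inverse symmetric square root of a symmetric positive definite matrix."""
    lam, V = np.linalg.eigh(S)
    return (V / np.sqrt(lam)) @ V.T

rng = np.random.default_rng(4)
p = (2, 3, 3)                          # hypothetical dimensions of X1, X2, X3
P = sum(p)
A = rng.standard_normal((P, P))
Sigma = A @ A.T + np.eye(P)
idx = np.split(np.arange(P), np.cumsum(p)[:-1])
S = [[Sigma[np.ix_(a, b)] for b in idx] for a in idx]   # S[i][j]: block (i+1, j+1)

def C(i, j):
    """Finite-dimensional C_ij = S_ij S_j^{-1}, with 1-based indices as in the text."""
    return S[i - 1][j - 1] @ np.linalg.inv(S[j - 1][j - 1])

C22_1 = np.eye(p[1]) - C(2, 1) @ C(1, 2)
C33_1 = np.eye(p[2]) - C(3, 1) @ C(1, 3)
K32 = C(3, 2) - C(3, 1) @ C(1, 2)
K23 = C(2, 3) - C(2, 1) @ C(1, 3)

# Squared partial canonical correlations from the operator formulation.
M = np.linalg.inv(C33_1) @ K32 @ np.linalg.inv(C22_1) @ K23
rho_sq_operator = np.sort(np.linalg.eigvals(M).real)[::-1]

# Classical route: ordinary CCA of the residuals after projecting on X1.
S1_inv = np.linalg.inv(S[0][0])
R22 = S[1][1] - S[1][0] @ S1_inv @ S[0][1]
R33 = S[2][2] - S[2][0] @ S1_inv @ S[0][2]
R23 = S[1][2] - S[1][0] @ S1_inv @ S[0][2]
rho_classical = np.linalg.svd(inv_sqrt(R22) @ R23 @ inv_sqrt(R33), compute_uv=False)

print(rho_sq_operator)
print(rho_classical**2)
```

The two printed vectors agree up to rounding in this toy setting.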


Example 3.1. The basic computational algorithm from Example 2.3 can be adapted
for computing sample partial canonical correlations. One now carries out principal
components analysis of the data from all three processes and then regresses the scores for
the $X_2, X_3$ process data onto the scores from the $X_1$ sample paths. The Example 2.3
computational scheme is then applied to the residuals from the two regression analyses.
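
A minimal sketch of this residual-based scheme follows. It is an illustration under stated assumptions, not the code behind the reported numbers: curves are observed on a common grid, scores come from a singular value decomposition of the centered data, and the final canonical correlation step orthonormalizes the residual score matrices by QR before taking singular values (a standard way to compute sample canonical correlations, used here in place of re-running Example 2.3 verbatim). The function names and the toy data model are hypothetical.

```python
import numpy as np

def standardized_scores(X, m):
    """Scores on the first m principal components, scaled to unit sample variance."""
    Xc = X - X.mean(axis=0)
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    return np.sqrt(X.shape[0] - 1) * U[:, :m]

def residualize(W, W1):
    """Residuals from least-squares regression of the columns of W on W1.

    No intercept is needed because the scores are already centered.
    """
    coef, *_ = np.linalg.lstsq(W1, W, rcond=None)
    return W - W1 @ coef

def sample_partial_cca(X1, X2, X3, m):
    W1, W2, W3 = (standardized_scores(X, m) for X in (X1, X2, X3))
    R2 = residualize(W2, W1)
    R3 = residualize(W3, W1)
    Q2, _ = np.linalg.qr(R2)           # orthonormal basis of each residual space
    Q3, _ = np.linalg.qr(R3)
    return np.linalg.svd(Q2.T @ Q3, compute_uv=False)

# Toy data (hypothetical model): X2 and X3 both load on the first score of X1;
# after removing X1 their first scores remain correlated at 1/sqrt(2).
rng = np.random.default_rng(6)
n, T, J = 2000, 50, 3
t = np.linspace(0.0, 1.0, T)
basis = np.sqrt(2.0) * np.sin(np.pi * np.outer(np.arange(1, J + 1), t))
scale = 1.0 / np.arange(1, J + 1)
xi1 = rng.standard_normal((n, J)) * scale
E = rng.standard_normal((n, J))
F = rng.standard_normal((n, J))
xi2 = F * scale
xi3 = E * scale
xi2[:, 0] = (E[:, 0] + F[:, 0]) / np.sqrt(2.0) + xi1[:, 0]
xi3[:, 0] = E[:, 0] + 2.0 * xi1[:, 0]
X1, X2, X3 = xi1 @ basis, xi2 @ basis, xi3 @ basis

rho = sample_partial_cca(X1, X2, X3, m=3)
print(rho)
```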

To illustrate the idea, consider again the two processes from Example 2.3. Sample paths
were generated as before except that in each instance we subtracted a term $\beta Z\cos(\pi s)$
with $Z$ a standard normal random variable and $\beta$ equal to 1 for the $X_1$ process and 2
for the $X_2$ process. The only nonzero partial canonical correlation in this case is again
$1/\sqrt{2}$. The first two partial canonical correlations obtained from an empirical experiment
using the same parameters as in Example 2.3 had means (standard deviations) of 0.7107
(0.0875) and 0.0818 (0.0157) for samples of size 250 and 0.7141 (0.0599) and 0.0553
(0.0089) for samples of size 500.

4. Summary 

We have developed a framework that can be used to study the correlation properties
of groups of Hilbert space indexed stochastic processes. Our applications have been
restricted to groups of size two or three; however, it is clear that similar analyses are
possible with any finite number of processes. For example, the partial canonical correlation
work of Section 3 extends in principle to examination of pairs of residual processes
after correcting for projections onto several other processes.

We note in passing that it has been assumed that all the $\mathcal{H}$-valued random variables
take values in the same Hilbert space. The extension to where some or all of the variables
produce elements of different Hilbert spaces incurs some additional notational expense
but is otherwise straightforward.


Technical Appendix 

In this Appendix, we collect some of the mathematical details that were needed for 
our main results. In particular, the developments in Sunder [19] play a pivotal role in 
Sections 2-3. Thus, we first summarize the key aspects of that work that were employed 
in the paper. 

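Assume that a Hilbert space $\mathcal{H}$ can be written as the algebraic direct sum of $n$ closed
subspaces $M_1, \ldots, M_n$. That is,

$$\mathcal{H} = \sum_{i=1}^{n} M_i$$

where $M_i \cap M_j = \{0\}$ for $i \ne j$. Now, for $1 < k \le n$ define

$$L_k = M_k \cap \Bigl(\sum_{i=1}^{k-1} M_i\Bigr)^{\perp}.$$

Then, $L_k \perp M_i$, for $i = 1, \ldots, k - 1$, and by construction
$\sum_{i=1}^{k} M_i = \bigoplus_{i=1}^{k} L_i$, $k = 1, \ldots, n$.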
Let $P_{M_k}$ and $P_{L_k}$ be the orthogonal projection operators onto $M_k$ and $L_k$, respectively.
Then, for $1 \le k \le n$ and $1 \le j \le k \le n$ we define the restriction of $P_{L_j}$ to $M_k$ by
$P_{L_j|M_k}x = P_{L_j}x$ for $x \in M_k$ and use $P_{M_k|L_j}y = P_{M_k}y$ for $y \in L_j$ to indicate the
restriction of $P_{M_k}$ to $L_j$. Sunder [19] establishes the following relationship between the
$M_k$ and $L_k$.

Theorem A.1. For $x \in M_k$, we can write $M_k$ as

$$\begin{aligned}
M_k &= \{(P_{L_1|M_k}x, \ldots, P_{L_k|M_k}x, 0, \ldots, 0)\} \\
&= \{(P_{L_1|M_k}(P_{L_k|M_k})^{-1}P_{L_k|M_k}x, \ldots, P_{L_k|M_k}x, 0, \ldots, 0)\} \\
&= \{(P_{L_1|L_k}z, \ldots, P_{L_{k-1}|L_k}z, z, 0, \ldots, 0)\}
\end{aligned}$$

where $z = P_{L_k|M_k}x \in L_k$ and $P_{L_j|L_k} = P_{L_j|M_k}(P_{L_k|M_k})^{-1}$ for $1 \le j < k \le n$.

Theorem A.1 has the consequence that problems involving optimization over $M_k$ can
instead be formulated in terms of equivalent problems on $L_k$ which is how it is applied
in Sections 2-3.

We next turn to the proof of Theorem 2.2. This is accomplished via the following 
proposition and its corollaries. 

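Proposition A.1. If $h = (C_{12} f_2, f_2) \in M_2$, then $P_{L_1|M_2}h = (C_{12} f_2, C_{21}C_{12} f_2)$ and
$P_{L_2|M_2}h = (I - P_{L_1|M_2})h = (0, C_{22.1} f_2)$.

Proof. Let $h_1 = (f_1, C_{21} f_1) \in M_1 = L_1$. Then,

$$\langle P_{L_1|M_2}h, h_1\rangle_{\mathcal{H}(Q)} = \langle h, h_1\rangle_{\mathcal{H}(Q)}$$

for every $h_1 \in M_1$. Writing $P_{L_1|M_2}h = (f_1^*, C_{21} f_1^*)$ leads to

$$\begin{aligned}
\langle P_{L_1|M_2}h, h_1\rangle_{\mathcal{H}(Q)} &= \langle (f_1^*, C_{21} f_1^*), (f_1, 0)\rangle_0 = \langle f_1^*, f_1\rangle_{\mathcal{H}(S_1)} \\
&= \langle (C_{12} f_2, f_2), h_1\rangle_{\mathcal{H}(Q)} = \langle (C_{12} f_2, f_2), (f_1, 0)\rangle_0 \\
&= \langle C_{12} f_2, f_1\rangle_{\mathcal{H}(S_1)}
\end{aligned}$$

for every $f_1 \in \mathcal{H}(S_1)$. So, $f_1^* = C_{12} f_2$. □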

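Corollary A.1. If $h = (0, f_2) \in L_2$, $(P_{L_2|M_2})^{-1}h = (C_{12}C_{22.1}^{-1} f_2,\ C_{22.1}^{-1} f_2)$.

Corollary A.2. For $h = (0, f_2) \in L_2$, we have

$$Bh := P_{L_1|M_2}(P_{L_2|M_2})^{-1}h = (C_{12}C_{22.1}^{-1} f_2,\ C_{21}C_{12}C_{22.1}^{-1} f_2).$$

Corollary A.3. Let $h = (0, f_2), h' = (0, f_2') \in L_2$. Then,

$$\langle h, h'\rangle_{\mathcal{H}(Q)} = \langle (0, f_2), Q^{-1}(0, f_2')\rangle_0 = \langle f_2, C_{22.1}^{-1} f_2'\rangle_{\mathcal{H}(S_2)}.$$

With a little extra effort we also obtain the following corollary.

Corollary A.4. $B^*(f_1, C_{21} f_1) = (0, C_{21} f_1)$.

Proof. For $h_1 = (f_1, C_{21} f_1) \in M_1 = L_1$ and $h = (0, f_2) \in L_2$,

$$\begin{aligned}
\langle h_1, Bh\rangle_{\mathcal{H}(Q)} &= \langle Q^{-1}h_1, Bh\rangle_0 \\
&= \langle (f_1, 0), (C_{12}C_{22.1}^{-1} f_2,\ C_{21}C_{12}C_{22.1}^{-1} f_2)\rangle_0 \\
&= \langle f_1, C_{12}C_{22.1}^{-1} f_2\rangle_{\mathcal{H}(S_1)} = \langle C_{22.1}^{-1}C_{21} f_1, f_2\rangle_{\mathcal{H}(S_2)} \\
&= \langle B^*h_1, h\rangle_{\mathcal{H}(Q)}.
\end{aligned}$$

An application of Corollary A.3 completes the proof. □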

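Finally, we give the details for proving Theorem 3.1. Analogous to the proof of
Theorem 2.2, the steps are broken down into a proposition and its subsequent corollaries.

Proposition A.2. If $h = (C_{12} f_2, f_2, C_{32} f_2) \in M_2$, $P_{L_1|M_2}h = (C_{12} f_2, C_{21}C_{12} f_2, C_{31}C_{12} f_2)$
and $P_{L_2|M_2}h = (I - P_{L_1|M_2})h = (0,\ C_{22.1} f_2,\ (C_{32} - C_{31}C_{12}) f_2)$.

Proof. For $h_1 = (f_1, C_{21} f_1, C_{31} f_1) \in M_1 = L_1$, we have the relation

$$\langle P_{L_1|M_2}h, h_1\rangle_{\mathcal{H}(Q)} = \langle h, h_1\rangle_{\mathcal{H}(Q)}.$$

Writing $P_{L_1|M_2}h = (f_1^*, C_{21} f_1^*, C_{31} f_1^*)$ leads to

$$\begin{aligned}
\langle P_{L_1|M_2}h, h_1\rangle_{\mathcal{H}(Q)} &= \langle (f_1^*, C_{21} f_1^*, C_{31} f_1^*), (f_1, 0, 0)\rangle_0 \\
&= \langle f_1^*, f_1\rangle_{\mathcal{H}(S_1)} \\
&= \langle h, h_1\rangle_{\mathcal{H}(Q)} \\
&= \langle (C_{12} f_2, f_2, C_{32} f_2), (f_1, 0, 0)\rangle_0 \\
&= \langle C_{12} f_2, f_1\rangle_{\mathcal{H}(S_1)}
\end{aligned}$$

for every $f_1 \in \mathcal{H}(S_1)$. So $f_1^* = C_{12} f_2$. □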

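For subsequent notational convenience, let

$$C_0 = C_{33.1} - (C_{32} - C_{31}C_{12})C_{22.1}^{-1}(C_{23} - C_{21}C_{13}).$$

Corollary A.5. If $h = (C_{13} f_3, C_{23} f_3, f_3) \in M_3$, $P_{L_1|M_3}h = (C_{13} f_3, C_{21}C_{13} f_3, C_{31}C_{13} f_3)$,
$P_{L_2|M_3}h = (0,\ (C_{23} - C_{21}C_{13}) f_3,\ (C_{33.1} - C_0) f_3)$ and $P_{L_3|M_3}h = (0, 0, C_0 f_3)$.

Proof. For $h_2 = (0,\ C_{22.1} f_2,\ (C_{32} - C_{31}C_{12}) f_2) \in L_2$ and $h \in M_3$, we have the relation
$\langle P_{L_2|M_3}h, h_2\rangle_{\mathcal{H}(Q)} = \langle h, h_2\rangle_{\mathcal{H}(Q)}$. If we write
$P_{L_2|M_3}h = (0,\ C_{22.1} f_2^*,\ (C_{32} - C_{31}C_{12}) f_2^*)$, then

$$\begin{aligned}
\langle P_{L_2|M_3}h, h_2\rangle_{\mathcal{H}(Q)} &= \langle (0,\ C_{22.1} f_2^*,\ (C_{32} - C_{31}C_{12}) f_2^*),\ (-C_{12} f_2, f_2, 0)\rangle_0 \\
&= \langle C_{22.1} f_2^*, f_2\rangle_{\mathcal{H}(S_2)} \\
&= \langle h, h_2\rangle_{\mathcal{H}(Q)} \\
&= \langle (0, 0, f_3),\ (0,\ C_{22.1} f_2,\ (C_{32} - C_{31}C_{12}) f_2)\rangle_0 \\
&= \langle f_3, (C_{32} - C_{31}C_{12}) f_2\rangle_{\mathcal{H}(S_3)} \\
&= \langle (C_{23} - C_{21}C_{13}) f_3, f_2\rangle_{\mathcal{H}(S_2)}.
\end{aligned}$$

So, $f_2^* = C_{22.1}^{-1}(C_{23} - C_{21}C_{13}) f_3$. □

Corollary A.6. For $h = (0, 0, f_3) \in L_3$,

$$Bh = \bigl(0,\ (C_{23} - C_{21}C_{13})C_0^{-1} f_3,\ (C_{32} - C_{31}C_{12})C_{22.1}^{-1}(C_{23} - C_{21}C_{13})C_0^{-1} f_3\bigr).$$

Corollary A.7. If $h = (0,\ C_{22.1} f_2,\ (C_{32} - C_{31}C_{12}) f_2) \in L_2$, then

$$B^*h = (0,\ 0,\ (C_{32} - C_{31}C_{12}) f_2).$$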

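Proof. For $h = (0,\ C_{22.1} f_2,\ (C_{32} - C_{31}C_{12}) f_2) \in L_2$ and $h_3 = (0, 0, f_3) \in L_3$,

$$\begin{aligned}
\langle Bh_3, h\rangle_{\mathcal{H}(Q)} &= \langle Bh_3, Q^{-1}h\rangle_0 \\
&= \langle Bh_3, (-C_{12} f_2, f_2, 0)\rangle_0 \\
&= \langle (C_{23} - C_{21}C_{13})C_0^{-1} f_3, f_2\rangle_{\mathcal{H}(S_2)} \\
&= \langle C_0^{-1} f_3, (C_{32} - C_{31}C_{12}) f_2\rangle_{\mathcal{H}(S_3)} \\
&= \langle h_3, B^*h\rangle_{\mathcal{H}(Q)} \\
&= \langle Q^{-1}h_3, B^*h\rangle_0 \\
&= \bigl\langle \bigl([C_{12}C_{22.1}^{-1}(C_{23} - C_{21}C_{13}) - C_{13}]C_0^{-1} f_3, \\
&\qquad\ -C_{22.1}^{-1}(C_{23} - C_{21}C_{13})C_0^{-1} f_3,\ C_0^{-1} f_3\bigr),\ B^*h\bigr\rangle_0. \quad □
\end{aligned}$$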